{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# writing prompt augmentation data task"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/LAION-AI/Open-Assistant/blob/main/notebooks/data-augmentation/writing-prompt/writing_prompt.ipynb)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comments"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1. ontocord"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use the prompts/story dataset from here: https://www.kaggle.com/datasets/ratthachat/writing-prompts. In addition to the prompts and story, augment with instructions such as “write a story about {prompt}, ending with the sentence {last_sentence}”. “write a story about {prompt}, where the beginning of the story is about {summary of the beginning part}”. “write a story about {prompt}, where the middle of the story is about {summary of the middle part}”. “write a story about {prompt}, where the end of the story is about {summary of the end part}”"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. fabraz"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here are some samples from writing prompts:\n",
    "\n",
    "|id|prompt|story|\n",
    "|--|--|--|\n",
    "|1|[ WP ] When you die , you do n't go to the afterlife of you 're religion , you go to the afterlife of the religion whose tenets you followed most closely , knowingly or not .|Thomas loves science fiction , and is pleased to find himself sitting by the park entrance with Arthur C. Clarke ’ s “ Fountains of Paradise ” open in his lap . He must have jogged there , he thinks to himself as he admires his brand new black-and-white Nikes . He stretches out in his black joggers and turns the page . “ But there was no substitute for reality , one should beware of imitations ” , he reads before shutting the book . <newline> <newline> Thomas ponders what he has read as he looks to the right ; not a single car can be seen . The street appears infinite in length and the buildings fade in to the distance with it . He stands and begins his first step down the street . <newline> <newline> His movement halts when he hears a young voice behind him , “ You look thirsty mister . Would you like some lemonade ? ” <newline> <newline> Thomas walks back past the park entrance and over to the lemonade stand , wondering how he had not noticed it before . It is beautiful , the entrance ; but the park is closed now . Thomas stares up at the gates in awe . <newline> <newline> Thomas is interrupted again by the child , “ $ 5.50 , please. ” <newline> <newline> Thomas looks at the counter , flustered . “ I ’ ll have the punch instead. ” <newline> <newline> As the child pours the purple drink in to the cup , Thomas reaches in his pocket finding a five dollar bill and three quarters . <newline> <newline> “ Keep the change ” , Thomas says as he picks up his drink . <newline> <newline> Thomas sips and the sky slowly dims . He feels his breath drawn away from him as a comet sails over the park entrance . And Heaven ’ s Gate opens . <newline>|\n",
    "|2|[ CW ] [ PM ] Write your hero into a corner , and let me get them out .|Bob dropped five of the Zeds , reloaded his Colt 45 , and ran up the stairs . <newline> <newline> He had someone currently upstairs , alerting Search and Rescue to find a place to land in this urban , industrial nightmare . They were currently in a truck depot , the places where goods would be transferred truck from truck . <newline> <newline> Already , some men defending the front door had been pulled in , causing the rest to fall back . The first , and only , line of physical defense , the hardened steel gates , created to stop robbers , were badly banged up , from the onslaught of fists against it . It was bad enough that the zombies managed to cram two at once inside the doorway , but losing the gates would mean that the horde would rush in . <newline> <newline> `` Hey ! '' Courtney rushed outside the communications office , her .22 rifle in hand . `` They 're at the trainstation , just a block from here ! '' <newline> <newline> `` It 's probably too late , mate . '' Bob said back , `` Just look at 'em ! '' <newline> <newline> The metal steps leading to the elevated walkway was a savior , only allowing one body to get in at a time . Unfortunately , our heroes had just fought their way here , from a few streets down . Seems easy ? Not when you have to take detours through heavily infested buildings because of blockades in the roads , or just the sheer number of walkers wouldn't 've allowed you to run through them . <newline> <newline> Bob 's equipped with a Colt 1911 .45 caliber pistol , excellent at punching through heads , but at the cost of heavy kickback . Also due to it 's temptingness , Bob has used all but three 7-round magazines . He has a knife , but who the hell would be able to take anyone out with that ? <newline> <newline> Courtney has her 10/22 Ruger Takedown . Initially intended for long range hunting , the rifle particularly excels at going through targets cleanly . 
The only disadvantage is the lack of stopping power . <newline> <newline> They have a fully gassed up FedEx truck at their disposal . A few men inside , surrounded , but armed , are ready to go when you tell them where they need to go . <newline> <newline> Around 31 zombies have gotten in already , with god knows how much outside .|\n",
    "|3|[ cw ] write about the strangest/scariest/saddest dream you 've ever had in less than 200 words .|The night was as thick and terrifying as any I had ever seen before . All I could hear was the scream of the wind past my ears , the pounding of hooves , huffed horse breaths , and the pounding of my own heart . <newline> <newline> The woods were closeknit , and my path was barely visible , hidden under a thick layer of bracken . <newline> <newline> `` Faster , '' I whispered as I dug my heels in . Safety was close and yet so far away , calling to me . He would save me ; I knew it with all my heart . <newline> <newline> All I had to do was outrun the demons at my back first .|\n",
    "\n",
    "Just in case anyone wants the [prompt tag description](https://www.reddit.com/r/WritingPrompts/wiki/how_to_tag_prompts/).\n",
    "\n",
    "@ontocord , can you improve the issue details having the samples above, please?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3. ontocord"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Interesting how the [XX] tags are used. I wasn't thinking about those.\n",
    "\n",
    "I was thinking of Instructions -> answers like \"User: write me a story about {stripped_prompt} -> Rosey: Sure, here's a story about {stripped_prompt}: {story}\"\n",
    "where stripped_prompt removes things like \"write about\" \"in less than 200 words\", etc.\n",
    "\n",
    "And the inverse \"User: What is this story about {story} -> Rosey: I think it's about {striped_prompt}\"\n",
    "\n",
    "You could also do summarization of longer stories into 4 or 5 pointed sentences and ask for an outline. Or you could give an outline and ask Rosey to fill in the story.\n",
    "\n",
    "For the prompt tag, you could add constraings to the prompts based on the tag. So for [RF], you could add to the end of the actual instruciton: this story could {have happened before or should be able to happen in the real world to unknown people. Not what you think could happen in the future.}\n",
    "\n",
    "Lmk know if you need more input."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4. ontocord"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Also these instructions:\n",
    "“write a story about {prompt}, ending with the sentence {last_sentence}”. “write a story about {prompt}, where the beginning of the story is about {summary of the beginning part}”. “write a story about {prompt}, where the middle of the story is about {summary of the middle part}”. “write a story about {prompt}, where the end of the story is about {summary of the end part}”"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Pipeline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The goal of this task was to auto-generate question/answer samples from writingPrompts to feed openAssistant. To do that we should standardize the way a prompt was written. Our choice was to set prompt templates which might turn the generation process feasible. Here are the templates we applied:\n",
    "\n",
    "* Base template: every prompt would have this sample.\n",
    "> User: write me a story about: {stripped_prompt} -> Rosey: Sure, here's a story about: {stripped_prompt}:\\n{story}\n",
    "\n",
    "where `stripped_promt` is the cleared prompt output by regex pattern to take out parts of a prompt that would not fit the template. And `story` is the actual answer to a prompt.\n",
    "\n",
    "* General constraints: a prompt whose constraint was found by regex pattern would have this also.\n",
    "> Base template, {stripped_constraint} -> Rosey: Sure, here's a story about: {stripped_prompt}, {stripped_constraint}:\\n{story}\n",
    "\n",
    "where `stripped_constraint` is the constraint found.\n",
    "\n",
    "* Answer beginning constraints: this constraint was imposed by the way the answer should start.  \n",
    "> Base template, starting with: {beggining} -> Rosey: Sure, here's a story about: {stripped_prompt}, starting with: {beggining}:\\n{story}\n",
    "\n",
    "where `beginning` is the first sentence of a story.\n",
    "\n",
    "* Answer end constraints: this constraint was imposed by the way the answer should end.  \n",
    "> Base template, ending with: {ending} -> Rosey: Sure, here's a story about {stripped_prompt}: ending with: {ending}\\n{story}\n",
    "\n",
    "where `ending` is the last sentence of a story.\n",
    "\n",
    "* Answer middle constraints: this constraint was imposed by the way the answer should have in its middle text.  \n",
    "> Base template, where the middle of the story is about: {middle} -> Rosey: Sure, here's a story about: {stripped_prompt}, where the middle of the story is about: {middle}:\\n{story}\n",
    "\n",
    "where `middle` is a summary of a story without the first and last sentence brought by a generative model\n",
    "\n",
    "To get the samples we used the following pipeline:\n",
    "\n",
    "* **Get data**: download from kaggle\n",
    "* **Pre-processing**: load data from entails source/taget (aka: prompt/story) by every split (train/valid/test) merging into one pandas dataframe, enhancing tit with tabular info about the sample tags.\n",
    "* **Triage prompts**: we pick prompts sorted by frequency, and we built regex pattern for some of them to extract a striped prompt and the related constraint.\n",
    "* **Split stories**: after removing story beginning and ending sentences, we applied a sentence sliding window to get stories middle summaries."
   ]
  },
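  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of the base, beginning and ending templates above, the sketch below builds the three samples for one prompt/story pair. The function name `build_samples` and the naive split on `' . '` are assumptions made for this example only; the regex-based stripping and the generative middle summary are not reproduced here.\n",
    "\n",
    "```python\n",
    "def build_samples(stripped_prompt, story):\n",
    "    # Naive sentence split: assumes the dataset's tokenized ' . ' separator.\n",
    "    sentences = [s.strip() for s in story.split(' . ') if s.strip()]\n",
    "    # Suffix appended to both the user request and the assistant preamble.\n",
    "    variants = {\n",
    "        'base': '',\n",
    "        'beginning': f', starting with: {sentences[0]}',\n",
    "        'ending': f', ending with: {sentences[-1]}',\n",
    "    }\n",
    "    samples = {}\n",
    "    for name, suffix in variants.items():\n",
    "        user = f'User: write me a story about: {stripped_prompt}{suffix}'\n",
    "        rosey = f\"Rosey: Sure, here's a story about: {stripped_prompt}{suffix}:\\n{story}\"\n",
    "        samples[name] = f'{user} -> {rosey}'\n",
    "    return samples\n",
    "\n",
    "\n",
    "demo = build_samples('a moon egg', 'The moon cracked . We stared . It hatched .')\n",
    "print(demo['ending'])\n",
    "```\n",
    "\n",
    "Keeping the suffix in one place means the user request and the assistant preamble cannot drift apart, which is the consistency the templates rely on."
   ]
  },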
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Get data from Kaggle\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# helper functions\n",
    "import json\n",
    "\n",
    "\n",
    "def save_credentials(d):\n",
    "    with open(\"/root/.kaggle/kaggle.json\", \"w\") as outfile:\n",
    "        json.dump(d, outfile)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "mv: cannot stat '/mnt/home/fabraz/kaggle.json': No such file or directory\n"
     ]
    }
   ],
   "source": [
    "# uncomment the following instructions, in case you want to save a .kaggle.json\n",
    "# d = {}\n",
    "# d['username'] = 'user'\n",
    "# d['key'] = 'key'\n",
    "#!mkdir ~/.kaggle\n",
    "# save_credentials(d)\n",
    "!mv ~/kaggle.json ~/.kaggle/\n",
    "!chmod 600 ~/.kaggle/kaggle.json"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#!pip install kaggle"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/bin/bash: kaggle: command not found\n"
     ]
    }
   ],
   "source": [
    "!kaggle datasets download -d ratthachat/writing-prompts"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Archive:  writing-prompts.zip\n",
      "  inflating: writingPrompts/README   \n",
      "  inflating: writingPrompts/test.wp_source  \n",
      "  inflating: writingPrompts/test.wp_target  \n",
      "  inflating: writingPrompts/train.wp_source  \n",
      "  inflating: writingPrompts/train.wp_target  \n",
      "  inflating: writingPrompts/valid.wp_source  \n",
      "  inflating: writingPrompts/valid.wp_target  \n"
     ]
    }
   ],
   "source": [
    "!unzip writing-prompts.zip"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pre-processing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from IPython.display import display, HTML"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# helper functions\n",
    "import re\n",
    "\n",
    "\n",
    "def load_file(path, names):\n",
    "    with open(path, \"r\") as f:\n",
    "        lines = f.readlines()\n",
    "    return pd.DataFrame(lines, columns=names)\n",
    "\n",
    "\n",
    "def load_data():\n",
    "    tags = {\n",
    "        \"WP\": \"Writing Prompt\",\n",
    "        \"SP\": \"Simple Prompt\",\n",
    "        \"EU\": \"Established Universe\",\n",
    "        \"CW\": \"Constrained Writing\",\n",
    "        \"TT\": \"Theme Thursday\",\n",
    "        \"PM\": \"Prompt Me\",\n",
    "        \"MP\": \"Media Prompt\",\n",
    "        \"IP\": \"Image Prompt\",\n",
    "        \"PI\": \"Prompt Inspired\",\n",
    "        \"OT\": \"Off Topic\",\n",
    "        \"RF\": \"Reality Fiction\",\n",
    "    }\n",
    "\n",
    "    dfConcat = pd.DataFrame()\n",
    "    for split in [\"train\", \"valid\", \"test\"]:\n",
    "        df = load_file(f\"writingPrompts/{split}.wp_source\", [\"prompt\"])\n",
    "        for tag in tags.keys():\n",
    "            df[tag.lower()] = df[\"prompt\"].map(lambda x: check_tag(x, tag.lower()))\n",
    "        df[\"tagCounter\"] = df.iloc[:, [2, -1]].sum(axis=1)\n",
    "        df[\"splitLineIndex\"] = df.index\n",
    "        story = load_file(f\"writingPrompts/{split}.wp_target\", [\"story\"])\n",
    "        df[\"story\"] = story[\"story\"]\n",
    "        df[\"split\"] = split\n",
    "        dfConcat = pd.concat([dfConcat, df])\n",
    "    return dfConcat\n",
    "\n",
    "\n",
    "def check_tag(item, tag):\n",
    "    r = re.compile(r\"[\\(\\{\\[]\\s*[\\w]{2}\\s*[\\]\\}\\)]\\s*\")\n",
    "    m = r.findall(item.lower())\n",
    "    if len(m) > 0:\n",
    "        for group in m:\n",
    "            if tag in group:\n",
    "                return 1\n",
    "    return 0\n",
    "\n",
    "\n",
    "def show_data(df):\n",
    "    html_string = \"\"\"\n",
    "                <html>\n",
    "                  <head><title>HTML Pandas Dataframe with CSS</title></head>\n",
    "                  <link rel=\"stylesheet\" type=\"text/css\" href=\"df_style.css\"/>\n",
    "                  <body>\n",
    "                    {table}\n",
    "                  </body>\n",
    "                </html>.\n",
    "                \"\"\"\n",
    "    df = df.replace(\"\\<newline\\>|\\< newline \\>|\\<new line\\>\", \"\\n\", regex=True)\n",
    "    df.style.set_properties(**{\"text-align\": \"left\"}).set_table_styles(\n",
    "        [dict(selector=\"th\", props=[(\"text-align\", \"left\")])]\n",
    "    )\n",
    "    html = df.to_html()\n",
    "    html_string = html_string.format(table=html)\n",
    "    html_string = (\n",
    "        html_string.replace(r\"\\n\", \"<br>\")\n",
    "        .replace(\"<td>\", '<td style=\"text-align:left\">')\n",
    "        .replace(\"<th>\", '<th style=\"text-align:left\">')\n",
    "    )\n",
    "    display(HTML(html_string))\n",
    "\n",
    "\n",
    "def get_samples(df, n, constraint=None, show=True):\n",
    "    samples = zip(df[\"prompt\"].iloc[:n, 0].index, df[\"prompt\"].iloc[:n, 0], df[\"story\"].iloc[:n, 0])\n",
    "    df = pd.DataFrame(samples, columns=[\"index\", \"prompt\", \"story\"])\n",
    "    if constraint is not None:\n",
    "        df = df[df[\"prompt\"].str.contains(constraint)]\n",
    "    return df"
   ]
  },
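  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the tag detection in `check_tag` easier to follow, here is a small standalone sketch that uses the same bracket pattern to list the tags of a prompt and to strip them. The names `TAG_PATTERN`, `find_tags` and `strip_tags` are hypothetical and exist only for this illustration.\n",
    "\n",
    "```python\n",
    "import re\n",
    "\n",
    "# Same bracket pattern as check_tag: a two-character tag inside (), {} or [].\n",
    "TAG_PATTERN = re.compile(r'[\\(\\{\\[]\\s*\\w{2}\\s*[\\]\\}\\)]\\s*')\n",
    "\n",
    "\n",
    "def find_tags(prompt):\n",
    "    # Return the lower-cased two-character tags found in a prompt, in order.\n",
    "    return [re.sub(r'\\W', '', m).lower() for m in TAG_PATTERN.findall(prompt)]\n",
    "\n",
    "\n",
    "def strip_tags(prompt):\n",
    "    # Remove every bracketed tag together with its trailing whitespace.\n",
    "    return TAG_PATTERN.sub('', prompt).strip()\n",
    "\n",
    "\n",
    "prompt = '[ CW ] [ PM ] Write your hero into a corner , and let me get them out .'\n",
    "print(find_tags(prompt))\n",
    "print(strip_tags(prompt))\n",
    "```\n",
    "\n",
    "Because the pattern accepts any two word characters, it also matches bracketed pairs that are not in the `tags` dictionary; `check_tag` avoids counting those by testing for each known tag explicitly."
   ]
  },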
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[ WP ] Leonardo DiCaprio in a fit of rage begins to torpedo his own career by deliberately acting poorly and taking on bad films . He finally wins an oscar for starring in Paul Blart : Mall Cop 3 .\n",
      "[ CW ] Kill the writer in first-person narrative .\n"
     ]
    }
   ],
   "source": [
    "!head -n2 writingPrompts/test.wp_source"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds = load_data()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>prompt</th>\n",
       "      <th>wp</th>\n",
       "      <th>sp</th>\n",
       "      <th>eu</th>\n",
       "      <th>cw</th>\n",
       "      <th>tt</th>\n",
       "      <th>pm</th>\n",
       "      <th>mp</th>\n",
       "      <th>ip</th>\n",
       "      <th>pi</th>\n",
       "      <th>ot</th>\n",
       "      <th>rf</th>\n",
       "      <th>tagCounter</th>\n",
       "      <th>splitLineIndex</th>\n",
       "      <th>story</th>\n",
       "      <th>split</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>[ WP ] You 've finally managed to discover the...</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>So many times have I walked on ruins , the rem...</td>\n",
       "      <td>train</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>[ WP ] The moon is actually a giant egg , and ...</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>-Week 18 aboard the Depth Reaver , Circa 2023-...</td>\n",
       "      <td>train</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>[ WP ] You find a rip in time walking through ...</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>I was feckin ' sloshed , mate . First time I e...</td>\n",
       "      <td>train</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                              prompt  wp  sp  eu  cw  tt  pm  \\\n",
       "0  [ WP ] You 've finally managed to discover the...   1   0   0   0   0   0   \n",
       "1  [ WP ] The moon is actually a giant egg , and ...   1   0   0   0   0   0   \n",
       "2  [ WP ] You find a rip in time walking through ...   1   0   0   0   0   0   \n",
       "\n",
       "   mp  ip  pi  ot  rf  tagCounter  splitLineIndex  \\\n",
       "0   0   0   0   0   0           0               0   \n",
       "1   0   0   0   0   0           0               1   \n",
       "2   0   0   0   0   0           0               2   \n",
       "\n",
       "                                               story  split  \n",
       "0  So many times have I walked on ruins , the rem...  train  \n",
       "1  -Week 18 aboard the Depth Reaver , Circa 2023-...  train  \n",
       "2  I was feckin ' sloshed , mate . First time I e...  train  "
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ds.head(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(303358, 16)\n"
     ]
    }
   ],
   "source": [
    "print(ds.shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['splitLineIndex', 'prompt', 'story', 'split'], dtype='object')"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ds[ds[\"split\"] == \"test\"].iloc[:2, [13, 0, 14, -1]].columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Samples"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Train"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "                <html>\n",
       "                  <head><title>HTML Pandas Dataframe with CSS</title></head>\n",
       "                  <link rel=\"stylesheet\" type=\"text/css\" href=\"df_style.css\"/>\n",
       "                  <body>\n",
       "                    <table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th style=\"text-align:left\"></th>\n",
       "      <th style=\"text-align:left\">splitLineIndex</th>\n",
       "      <th style=\"text-align:left\">prompt</th>\n",
       "      <th style=\"text-align:left\">story</th>\n",
       "      <th style=\"text-align:left\">split</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th style=\"text-align:left\">0</th>\n",
       "      <td style=\"text-align:left\">0</td>\n",
       "      <td style=\"text-align:left\">[ WP ] You 've finally managed to discover the secret to immortality . Suddenly , Death appears before you , hands you a business card , and says , `` When you realize living forever sucks , call this number , I 've got a job offer for you . ''<br></td>\n",
       "      <td style=\"text-align:left\">So many times have I walked on ruins , the remainings of places that I loved and got used to.. At first I was scared , each time I could feel my city , my current generation collapse , break into the black hole that thrives within it , I could feel humanity , the way I 'm able to feel my body.. After a few hundred years , the pattern became obvious , no longer the war and damage that would devastate me over and over again in the far past was effecting me so dominantly . <br> It 's funny , but I felt as if after gaining what I desired so long , what I have lived for my entire life , only then , when I achieved immortality I started truly aging . <br> <br> 5 world wars have passed , and now they feel like a simple sickeness that would pass by every so often , I could no longer evaluate the individual human as a being of its own , the importance of mortals is merely the same as the importance of my skin cells ; They are a part of a mechanism so much more advanced , a mechanism that is so dear to my fallen heart a mechanism that I have seen fall and rise so many times , a mechanism that when lost all of which it had , had me loosing my will to live , for the first time in all of my thousands years of existence . <br> <br> Acceptance , something so important . a skill that has proved itself worthy dozens of times , an ability that looks so easy to achieve , a gift , that I was n't able to aquire in all my years , until now . When the ashes on the ground flew into the now empty air upon humanity 's fall , I felt as if all of it 's weight was crushing me . Ignorance took over and I searched years for a hope , a sign of the very same patterns that I used to watch reappear every hundred years , the very core of my will to exist that was now no more that I so strongly wish was . <br> <br> If you have ever wondered if silence can drive people crazy , it can.. 
<br> I ca n't feel my legs , I have walked for days , just to hear the sound of gravel , crushed bones , crushed buildings and crushed civilizations under my steps to keep my sanity.. until I remembered , the day in my far past . The day of my rebirth , I took out of my pocket a small plastic box , with nine buttons and a small glass window . I could n't believe this was our past , I could n't believe how far we have been able to progress and yet , be destroyed by our own violence . <br> I slowly dialed the number I was given , exactly 1729 years ago . <br> <br> I dropped a tear , a tear that was too slow to hit the ground as I got sucked into the darkness that emerged around me . <br> <br> A chill went through my spine as I saw my destiny rise above me , I could see the white teeth under the dark cloack ... <br> <br> `` You have finally arrived '' He projected into my mind , with the most chilling cold and unhuman voice . <br> <br> `` I 'm ready to obey '' I answered . I knew who was sitting infront of me , and it was time for me to obey him , after all these years of playing god , even I came to it . <br> <br> Funny is n't it ? Even by achieving immortality , death , is inescapable .<br></td>\n",
       "      <td style=\"text-align:left\">train</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th style=\"text-align:left\">1</th>\n",
       "      <td style=\"text-align:left\">1</td>\n",
       "      <td style=\"text-align:left\">[ WP ] The moon is actually a giant egg , and it has just started to hatch .<br></td>\n",
       "      <td style=\"text-align:left\">-Week 18 aboard the Depth Reaver , Circa 2023- <br> <br> I walk about the dull gray halls , the artificial gravity making my steps feel almost as if they were on land . Almost . I glance out a window as I pass it by . There 's the sun , and there 's the moon right there . And , of course , there 's the Earth . I kinda miss it . Then again , space is pretty cool . It 's got some brilliant views , and the wifi is surprisingly good . Even countless miles away from the Earth , I can crush Silver noobs on CS GO . <br> <br> I pass by Dale Malkowitz , the head scientist on board . <br> <br> `` Evening , Dale , '' I say . <br> <br> `` What up , Danny ? '' he replies cordially . <br> <br> `` Nothin ' much . A little bored , I guess . '' <br> <br> He shakes his head in disbelief . `` I really , *really* do n't understand how you can be bored in space . '' <br> <br> `` Well hey , '' I say slightly defensively , `` Aside from the views , it 's kinda ... dull . And empty . And stuff . '' <br> <br> `` Whatever you say , Wittell , '' he says , not unkindly . Then he walks off . A few moments pass , and then I decide to look out the window right by me . As my eyes scan the inky blackness of space ( again ) , I notice something odd about the moon 's surface . It 's slightly ... cracked . <br> <br> `` Hey , Malkowitz ? '' I call out , `` You might wan na check this out ! '' <br> <br> He walks over to me casually , probably expecting nothing . `` What ? '' he asks , `` What do you see ? '' <br> <br> I point at the moon . His brow furrows . `` Huh ... I guess there 's something up with the surface . I 'll have to look into tha- '' <br> <br> Suddenly , the surface cracks a little more . We glance at each other , and then back at the moon , and then at each other again , and then back at the moon again . <br> <br> `` What 's going on ? '' I ask , alarmed . <br> <br> He 's silent for a minute or two , mouth hanging open . 
Then , he calls out : `` Janice ! Terry ! Johnny ! Get over here ! Something 's up with the moon . '' <br> <br> The other crewmates enter , unsure of what to expect . As their eyes lay upon the moon 's surface cracks , they widen . <br> <br> And , by coincidence , more cracks appear at that very moment . And then more . And more . And more . And more ... <br> <br> Little bits of the moon begin to float away , torn free of the rest of the surface . We all stare , speechless . And then ... it happens . It *happens* . <br> <br> The side of the moon facing us is ... torn away by a ... <br> <br> Human ... hand ? <br> <br> And we see ... <br> <br> A giant ... human face ? ! <br> <br> Surprisingly , I can hear my thoughts over my racing heart . *I ca n't help but feel as if I recognize that face ... from the ... * <br> <br> *Internet . * <br> <br> Suddenly , the great face 's lips move . <br> <br> Of course , none of us can actually *hear* it speak , because of the laws of space and whatnot . However , I can read its lips , and it appears to be saying : <br> <br> `` Are you sure about that ? ''<br></td>\n",
       "      <td style=\"text-align:left\">train</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "                  </body>\n",
       "                </html>.\n",
       "                "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_data(ds[ds[\"split\"] == \"train\"].iloc[:2][[\"splitLineIndex\", \"prompt\", \"story\", \"split\"]]);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Valid"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "                <html>\n",
       "                  <head><title>HTML Pandas Dataframe with CSS</title></head>\n",
       "                  <link rel=\"stylesheet\" type=\"text/css\" href=\"df_style.css\"/>\n",
       "                  <body>\n",
       "                    <table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th style=\"text-align:left\"></th>\n",
       "      <th style=\"text-align:left\">splitLineIndex</th>\n",
       "      <th style=\"text-align:left\">prompt</th>\n",
       "      <th style=\"text-align:left\">story</th>\n",
       "      <th style=\"text-align:left\">split</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th style=\"text-align:left\">0</th>\n",
       "      <td style=\"text-align:left\">0</td>\n",
       "      <td style=\"text-align:left\">[ WP ] Every person in the world undergoes a `` goodness '' test . It 's designed to give a score from 1 to 200 , where 1 is pure evil , and 200 is an angel in human body . Then the world is divided into 200 zones , where people can live among their own kind .<br></td>\n",
       "      <td style=\"text-align:left\">Clancy Marguerian , 154 , private first class of the 150+ army , sits in his foxhole . Tired cold , wet and hungry , the only thing preventing him from laying down his rifle and walking towards the enemy lines in surrender is the knowledge that however bad he has it here , life as a 50-100 POW is surely much worse . He 's fighting to keep his eyes open and his rifle ready when the mortar shells start landing near him . <br> <br> He hunkers lower . <br> <br> After a few minutes under the barrage , Marguerian hears hurried footsteps , a grunt , and a thud as a soldier leaps into the foxhole . The man 's uniform is tan , he must be a 50-100 . <br> <br> The two men snarl and grab at eachother , grappling in the small foxhole . Abruptly , their faces come together . <br> <br> `` Clancy ? '' <br> <br> `` Rob ? '' <br> <br> Rob Hall , 97 , Corporal in the 50-100 army grins , as the situation turns from life or death struggle , to a meeting of two college friends . He lets go of Marguerian 's collar . <br> <br> `` Holy shit Clancy , you 're the last person I expected to see here '' <br> <br> `` Yeah '' <br> <br> `` Shit man , I did n't think I 'd ever see 'Mr . volunteers every saturday morning at the food shelf ' , not after The Reorganization at least '' <br> <br> `` Yeah Rob , it is something is n't it '' <br> <br> `` Man , I 'm sorry I tried to kill you there , hey , I heard you guys were out of food , here , you can share my dinner '' <br> <br> Clancy marvels , even after all this : The Reorganization , the coalitions , the war , Rob is still his old , chatty self . <br> <br> The two men sit , Rob chatting away , Clancy forcing out pleasantries . They pass Rob 's rations between them . <br> <br> <br> `` Clancy my man , I heard a group of terrorist 5 's took have formed some kind of cult , and they 're rallying all the &lt; 50 in their own coalition '' <br> <br> `` Oh yeah ? 
'' <br> <br> `` Yeah , I mean , that sucks and everything , cause those are some scary dudes , but I heard that there 's going to be a truce between our countries in a few days , why do n't we just hang out here , pretty soon we wo n't even be enemies anymore ! '' <br> <br> `` Yeah , Rob , that sounds like a plan '' <br> <br> `` Man , I 'm so glad I found you again , in a few days , this war will be over , and things will be cool between us and , hey , remember Sarah ? I heard she 's a 151 , maybe I 'll look her up , I 'll be sure to visit you too once I can get a pass to sector 150-155 , it 'll probably be tough though , even before the war , you had to do sooo much paperwork to be allowed to visit , I wonder if passes will even be reinstated after the truce ends , hey , did I ever tell you about the time ... '' <br> <br> Rob babbles as he dozes off , grinning up at Clancy . <br> <br> When Clancy is sure that his friend is asleep , he slits Rob 's throat with his bayonet . Clancy climbs out of the foxhole , and stumbles his way back to battalion HQ .<br></td>\n",
       "      <td style=\"text-align:left\">valid</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th style=\"text-align:left\">1</th>\n",
       "      <td style=\"text-align:left\">1</td>\n",
       "      <td style=\"text-align:left\">[ WP ] Space mining is on the rise . The Space tanker Exxon Valdez 2.0 crash and spill its cargo . Write a news story covering the event .<br></td>\n",
       "      <td style=\"text-align:left\">„… and the little duckling will never be able to walk again. ” <br> <br> The artificial intelligence paused a moment for dramatic effect before continuing with its broadcast with a different voice . <br> <br> “ What a hearth breaking story , Frank . But now to another story that may leave you feel equally dirty . The automated space tanker Exxon Valdez 2.0 collided with an asteroid on its way to the Jupiter moon Ganymede . According to the ship owner the ship is out of control and leaking its content into space. ” <br> <br> “ That ’ s right , Fred . And the content of the ship has it in it , as they say ” , the computer said in first voice again , “ The whole tanker was filled with ‘ biological waste products ’ coming from research and mining stations in the Kuiper Belt. ” <br> <br> “ Biological waste products ? You don ’ t mean ... ” <br> <br> “ Yes , Fred ! ” Dramatic pause . “ I am talking about poop . Lots of it . And apparently it ’ s spilling everywhere. ” <br> <br> “ Better call the plumbers , Frank. ” <br> <br> “ Not any time soon , Fred . A spokesperson of the ship owner stated and I quote – ‘ Space is kind of big and empty , we expect no one to care , so why should we ? ’ Apparently they will just build a new ship and be done with it. ” <br> <br> “ That ’ s one way not to deal with the problem . But why doesn ’ t the ship fly home ? Shouldn ’ t the AI on board be able to handle such a problem ? ” <br> <br> “ Well , the issue is that the part in charge to deal with asteroid impacts like that has been impacted by the asteroid. ” <br> <br> “ Ouch . Talk about a bad run. ” <br> <br> “ True , especially if you take the name of the ship in consideration. ” <br> <br> “ Oh ? Exxon Valdez 2.0 it was , isn ’ t that right , Frank ? ” <br> <br> “ You ’ re absolutely right , Fred . Did you know the ship was named after an infamous ship of the twentieth century back on old Earth ? 
Apparently the Exxon Valdez of old was used for transporting petroleum across the oceans of Earth . Petroleum , as some of our listeners might not know , was a brownish black , gooey liquid comprised of biological matter which was transformed under high pressure for millions of years . Quite ironically the Exxon Valdez was infamous for crashing and spilling its cargo. ” <br> <br> “ Well , talk about making a bad name for yourself . Now both ships will go down in history for spilling black gooey stuff where it doesn ’ t belong . Who had that bright idea for such a name anyway ? ” <br> <br> “ Well , Fred , the company made its first plunder by holding a naming contest on the internet. ” <br> <br> “ Oh , will they ever learn ? ” <br> <br> “ Apparently not , Fred . Predictably someone tried to make a joke out of it . A niche side of history role players got wind of the contest and made it its goal to get it named after the infamous Exxon Valdez . Apparently they thought it would be funny , and given the content both ships were ferrying around , they might have a point. ” <br> <br> “ Funny , indeed , Frank . What ’ s the name of the side ? ” <br> <br> “ Well , Fred , it ’ s called Reddit . The people there mostly talk in outdated lingo and memes and watch cat pictures back from a time when the internet only was local on Earth. ” <br> <br> “ Truly a herald of the dark ages. ” <br> <br> “ You might be right about that , Fred . I assume they just thought it was funny . I guess this happens , when you let the internet decide on things. ” <br> <br> “ Well , Frank , when you think about the content both ships were ferrying around , they might have been right . Embarrassing for the company , but funny for everyone else. ” <br> <br> “ It might get worse than that , Fred . Environmentalists are up in arms . 
They claim that the human waste products spilling out of the ship might collide with Jupiter ’ s moon Europa within the next few millennia and might contaminate the biospheres with Earth life . Apparently there are a lot of bacteria and the likes in poop and some might be able survive the harsh conditions of space and end up impacting on the restricted moon. ” <br> <br> “ Oh dear , Frank , does the Monolith know about it yet ? I am sure it won ’ t let us hear the end of it. ” <br><br></td>\n",
       "      <td style=\"text-align:left\">valid</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "                  </body>\n",
       "                </html>.\n",
       "                "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_data(ds[ds[\"split\"] == \"valid\"].iloc[:2][[\"splitLineIndex\", \"prompt\", \"story\", \"split\"]]);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "                <html>\n",
       "                  <head><title>HTML Pandas Dataframe with CSS</title></head>\n",
       "                  <link rel=\"stylesheet\" type=\"text/css\" href=\"df_style.css\"/>\n",
       "                  <body>\n",
       "                    <table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th style=\"text-align:left\"></th>\n",
       "      <th style=\"text-align:left\">splitLineIndex</th>\n",
       "      <th style=\"text-align:left\">prompt</th>\n",
       "      <th style=\"text-align:left\">story</th>\n",
       "      <th style=\"text-align:left\">split</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th style=\"text-align:left\">0</th>\n",
       "      <td style=\"text-align:left\">0</td>\n",
       "      <td style=\"text-align:left\">[ WP ] Leonardo DiCaprio in a fit of rage begins to torpedo his own career by deliberately acting poorly and taking on bad films . He finally wins an oscar for starring in Paul Blart : Mall Cop 3 .<br></td>\n",
       "      <td style=\"text-align:left\">The wet marble floor pressed on his cheek like a thousand hands slapping his face frozen in time . Smattering piss of rain ignored his indignant mumblings . His eyes fluttered . Pins and needs ran from finger to shoulder as he pushed back against the floor , contorting his aching body into a cross legged position . Last night was bad . He gathered that . His routine dullness of though crept inwards from the edges of his mind toward the black mist that veiled his most recent memories . He struggled to recall whatever he could n't recall but only for a moment before he decided it probably was n't worth the effort . <br> He glanced around the room for a few minutes before concluding that he probably did n't know where he was . His investigation was n't entirely fruitless , he discovered a mostly full bottle of vodka . It was cheap but would definitely get the job done . Taking a few swigs made it childishly easy to ignore that gigantic black cloud of fog blotting out whatever the hell he did before he woke up . <br> There was a mirror in the room and for want of anything more interesting to study he gazed at himself . It was a game he 'd play with himself , glancing at the mirror and seeing if he could recognize the person looking back . If he did n't know better he 'd have guessed he was a very successful mattress salesman , or perhaps a bum who had managed to score some luck gambling . <br> His face was portly and unshaven , in that limbo place where it had been too many days without being clean and too few days to become a beard . His stomach was round but firm , like a basketball stuffed under a shirt and then semi deflated . The hair was long and unruly , receding far into the past . But his eyes were the giveaway . Looking closely enough at them he could still see an intensity . It was n't the sharp kind he carried in his youth but rather like a rusted dagger . Still sharp enough to cut . <br> `` DiCaprio . 
'' The curse rasped out of him in a choke . After all these years spent working on the hallmark channel and tv series based on mediocre movies he was still there . Despite his best efforts to bury himself under all of the alchol and drugs he was still in there . He thought for sure after the bankruptcy he 'd be done , but no that god damned rerelease of Titanic the royalties started pouring in and he could n't get rid of the money . Not even the live action version of the nut job could destroy him . <br> Cursing he hurled the bottle at the mirror but his wet hands slipped and instead of a shattering crash there was only a thud as the bottle bounced off the dry wall and rolled on the floor . <br> His rage thwarted by his impotence he slumped against the floor and finally noticed why there was rain coming into this room . <br> <br> The window was smashed . He looked at the bottle , confused . No , he had n't done that . At least not with the vodka . He looked back at the glass etched around the window sill and his eyes hung on the red that stained the jagged teeth . <br> <br> The headache crept back towards the front of his mind while the bloody glass pinned his eyes in place . What the fuck happened last night ?<br></td>\n",
       "      <td style=\"text-align:left\">test</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th style=\"text-align:left\">1</th>\n",
       "      <td style=\"text-align:left\">1</td>\n",
       "      <td style=\"text-align:left\">[ CW ] Kill the writer in first-person narrative .<br></td>\n",
       "      <td style=\"text-align:left\">It 's been three days since my boyfriend pissed off the neighbors . <br> <br> They had to be pissed , he called the police on them . The neighbors had been harboring a runaway criminal . We did n't live in a bad neighborhood , there were families and good people living here with solid steady jobs . They cared about their yards and such . But , there was a bad egg , our neighbors to the south of us were shady . We could hear them yelling at their dog many times a week . Strange smoke often came out of their house , and the lights in the garage were on at odd hours . We never had proof until now that our concerns are legitimate . <br> <br> The car the escaped criminal was driving had been parked at the neighbor 's house and my boyfriend decided he should turn them in . This lead to the police parking in front of *our* house , and watching them through our bedroom window for hours until they caught him . They had to know it was us . And it freaked me out . <br> <br> I had started tucking my pink taser in my jacket pocket when I took my miniature Yorkie out to go potty . My neighbor to the north , Jay , seemed to notice my tension , so when he saw me step outside , he 'd come out and chat with me . He 'd ask me about work , and talk to me about his latest construction jobs . Jay always pretend to be grabbing something out of his massive pick-up truck . It usually followed the same pattern - he grabs something out of his truck , sees me out with my dog , then starts in on how it baffles him how such a tiny dog was smarter than most of the people he worked with . We 'd both gripe about our jobs and laugh about stupid customers , chase the puppy down when she tried to go after squirrels , and then part ways until the next potty break . <br> <br> The sun was beginning to set when my dog started doing her potty dance by the door . I put on my jacket , slipped my taser in my pocket , and opened the door . 
She bolted out the door and went straight for the squirrel sniffing around the sidewalk . <br> <br> `` NO ! BAD GIRL , COME HERE ! '' The squirrel started running across the road and her tiny legs skittered out of it . I ran after her , swearing as I tripped over a crack in the road . I felt a snap in my ankle and I went down . <br> <br> The roar of a large pick-up engine was too close and I did n't know what to look at - my little dog bouncing across the neighbor 's lawn , or the tires that were n't slowing down fast enough . I chose neither and closed my eyes . <br> <br> The last thing I heard was the clatter of of work boots and Jay voice cracking , `` Oh god , oh god , oh god ... '' <br> <br><br></td>\n",
       "      <td style=\"text-align:left\">test</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "                  </body>\n",
       "                </html>.\n",
       "                "
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "show_data(ds[ds[\"split\"] == \"test\"].iloc[:2][[\"splitLineIndex\", \"prompt\", \"story\", \"split\"]]);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Augmentation "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tqdm import tqdm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Triage Prompts\n",
    "\n",
    "1. Take the prompts list order by frequency\n",
    "2. Define regex patterns for prompt and constraint\n",
    "3. Generate prompts"
   ]
  },
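  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The three triage steps can be sketched on toy data as follows. This is a minimal, standalone illustration rather than the notebook's pipeline: the `prompts` list, the two simplified patterns, and the `make_instruction` helper are hypothetical stand-ins for `ds`, `PROMPT_PATTERNS`, and `CONST_PATTERNS`.\n",
    "\n",
    "```python\n",
    "import re\n",
    "from collections import Counter\n",
    "\n",
    "# Hypothetical toy prompts; the real notebook reads them from `ds`.\n",
    "prompts = [\n",
    "    \"[ WP ] Make me cry in 150 words or less\",\n",
    "    \"[ WP ] Make me cry in 150 words or less\",\n",
    "    \"[ WP ] Describe a sunrise .\",\n",
    "]\n",
    "\n",
    "# 1. Order the unique prompts by how many stories each one has.\n",
    "ordered = [prompt for prompt, _ in Counter(prompts).most_common()]\n",
    "\n",
    "# 2. One simplified pattern for the core prompt, one for the constraint.\n",
    "PROMPT_RE = re.compile(r\"(Make me cry)\")\n",
    "CONSTRAINT_RE = re.compile(r\"(in 150 words or less)\")\n",
    "\n",
    "\n",
    "# 3. Generate an instruction for each prompt that the patterns recognise.\n",
    "def make_instruction(prompt):\n",
    "    core = PROMPT_RE.search(prompt)\n",
    "    if core is None:\n",
    "        return None\n",
    "    constraint = CONSTRAINT_RE.search(prompt)\n",
    "    suffix = f\", {constraint.group(1)}\" if constraint else \"\"\n",
    "    return f\"write me a story about: {core.group(1)}{suffix}\"\n",
    "\n",
    "\n",
    "instructions = [make_instruction(prompt) for prompt in ordered]\n",
    "```\n",
    "\n",
    "Prompts that neither pattern recognises yield `None` and can simply be skipped."
   ]
  },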
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_rep = ds.groupby([\"prompt\", \"split\"]).size().reset_index().rename(columns={0: \"records\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_rep = df_rep[df_rep[\"records\"] > 20].sort_values([\"records\"], ascending=False)\n",
    "# _str = df_rep[df_rep['records']>20].sort_values(['records'], ascending=False).iloc[1,0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# df_rep[df_rep[\"split\"] == \"valid\"].iloc[1:3, 0]\n",
    "# topPrompts20Reps += df_rep[df_rep[\"split\"] == \"valid\"].iloc[1:3, 0].to_list()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[\"[ WP ] To get in Heaven , you have to confront the person who you hurt the most . You were expecting an ex , your parents/relatives , or a friend . You did n't expect to see yourself .\\n\",\n",
       " \"[ WP ] You are born without emotions ; to compensate this , you started a donation box where people could donate their unwanted emotions . You 've lived a life filled with sadness , fear and regret until one day , someone donates happiness .\\n\",\n",
       " \"[ WP ] You are a teenager with the ability to measure how `` Dangerous '' people are on a scale from 1 to 10 just by looking at them . A normal child would be a 1 , while a trained man with an assault rifle might be a 7 . Today , you notice the unassuming new kid at school measures a 10 .\\n\",\n",
       " '[ WP ] You live in a world where every person receives a superpower on their 18th birthday . You eagerly count down the seconds then shriek in horror as you are given a power no one would ever want to be stuck with .\\n']"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "topPrompts20Reps"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "topPrompts20Reps = df_rep[df_rep[\"records\"] > 20].sort_values([\"records\"], ascending=False)[\"prompt\"].tolist()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "We found 1016 prompts having more than 20 stories\n"
     ]
    }
   ],
   "source": [
    "print(f\"We found {len(topPrompts20Reps)} prompts having more than 20 stories\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "PROMPT_PATTERNS = \"(Lucifer\\snever[\\s\\w,]+)|\\\n",
    "([\\. \\w,]+)\\.\\s+Tell me|\\\n",
    "(All injuries[\\. \\w,]+)\\.|\\\n",
    "(?<!\\])(At your[\\. \\w,]+)\\.|\\\n",
    "Daily Prompt \\: ([\\. \\w,]+)|\\\n",
    "In 100 words or less , ([\\. \\w,]+)\\.|\\\n",
    "(Last words/thoughts[\\. \\w,]+)\\.|\\\n",
    "(Magic is Hereditary.*) \\[|\\\n",
    "word limit (\\) [\\. \\w,\\/]+) \\.|\\\n",
    "(Make me love the person you love)|\\\n",
    "(Pack a punch) in 150 words|\\\n",
    "(The last man on earth[\\. \\w,\\/]+kill himself)|\\\n",
    "(The year is 2352 [\\. \\w,\\/'-]+)\\.|\\\n",
    "(A person dies[\\. \\w,\\/]+)\\.?|\\\n",
    "^[wW]rite a story([\\. \\w,\\/]+) |\\\n",
    "^[wW]rite about ([\\. \\w,\\/-]+)\\.?|\\\n",
    "^Writing Prompt (?:\\: [wW]rite|\\\n",
    "\\[ WP \\]) ([\\. \\w,\\/']+) ?|\\\n",
    "^(You 're a[\\. \\w,\\/']+)|\\\n",
    "(You 're moments[\\. \\w,\\/']+)\\.|\\\n",
    "(Describe the room you [\\. \\w\\/']+)|\\\n",
    " (Get me hooked \\. [ \\w,\\/']+)|\\\n",
    "[\\. \\w\\/',\\`]+ , (tell a horror story)|\\\n",
    "(Make me cry)|\\\n",
    "(Make me hate your character)|\\\n",
    "(Most responses on here have a twist[\\. \\w\\/',\\`;]+)|\\\n",
    "(Pick your favorite[\\(\\)\\. \\w\\/',\\`;]+beginning)|\\\n",
    "(Start your story[\\(\\)\\. \\w\\/',\\`;]+meanings \\.)|\\\n",
    "(The [\\. \\w\\/',\\`;]+ reader)|\\\n",
    "(Two people[\\. \\w,\\/']+bench)|\\\n",
    "Write (a gruesome story)|\\\n",
    "Write (a möb[\\. \\w,\\/']+story) that|\\\n",
    "(Write the letter [ ,\\w]+) |\\\n",
    "There is no prompt[ \\.\\w]+(you[ \\.\\w']+\\.)|\\\n",
    "(A peaceful alien race[ \\.\\w'-]+)\\.|\\\n",
    "(This is the prologue[\\(\\) \\.\\w'-]+)\\.|\\\n",
    "Write a short story where (the first[\\(\\) \\.\\w'-,]+)\\.|\\\n",
    "(Write the first and last paragraph[\\(\\) \\.\\w'-,]+)\\.|\\\n",
    "(Killing Hitler has[\\(\\) \\.\\w'-,\\?]+)|\\\n",
    "(You live in a city full[\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "\\`\\` She said she loved him . [\\`'\\(\\) \\.\\w'-,\\?\\#]+\\.|\\\n",
    "(A soldier on the front dies[\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(You discover a grand hall[\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(A boy asks a girl out . It 's high[\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(When everyone turns 18 , they receive a pet[\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(To get in Heaven , you have to [\\/\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(You are born without emotions [;\\/\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(You are a teenager with the ability[\\`;\\/\\(\\) \\.\\w'-,\\?\\#]+)|\\\n",
    "(You live in a world where every person [\\`;\\/\\(\\) \\.\\w'-,\\?\\#]+)\"\n",
    "\n",
    "\n",
    "CONST_PATTERNS = \"Daily Prompt \\: [\\. \\w,]+\\[ ([\\. \\w,\\:]+)|\\\n",
    "(In 100 words or less) , ([\\. \\w,\\:]+) \\.|\\\n",
    "Make a story \\( ([\\. \\w,\\:]+) |\\\n",
    "Pack a punch (in 150 words)|\\\n",
    "Describe the room you [\\. \\w\\/']+([\\. \\w,\\:\\/]+)\\.|\\\n",
    "Get me hooked \\. Reel me in \\. ([\\. \\w\\/',\\`]+)\\.|\\\n",
    " ([\\. \\w\\/',\\`]+) , tell a horror story|\\\n",
    "Make me cry ([ \\w\\/',\\`]+).?|\\\n",
    "(in 150 words or less)|\\\n",
    "Pick your favorite[\\(\\)\\. \\w\\/',\\`;]+beginning \\. ([ \\w\\/',\\`]+)|\\\n",
    "Start your story[\\(\\)\\. \\w\\/',\\`;]+meanings \\.([ \\w\\/',\\`]+\\.)|\\\n",
    "The [\\. \\w\\/',\\`;]+ reader ,([\\. \\w\\/',\\`;]+)|\\\n",
    "Two people[\\. \\w,\\/']+bench \\. ([\\. \\w,\\:]+)|\\\n",
    "Write a gruesome story ([\\. \\w,\\:]+)|\\\n",
    "Write a möb[\\. \\w,\\/']+story (that[\\. \\w,\\/']+)\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Add summary columns to data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#!pip install spacy -qqq"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We aim to augment data as following:\n",
    "* Prompt: \n",
    "  * whole\n",
    "  * + constraints\n",
    "* Story:\n",
    "  * whole\n",
    "  * beginning\n",
    "  * middle - sliding window summarized\n",
    "  * end"
   ]
  },
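  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The story split listed above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the notebook's implementation: `sliding_windows` and the toy `sentences` are hypothetical, the window size of 3 is arbitrary, and the real pipeline additionally splits sentences with spaCy and summarizes each window with a model.\n",
    "\n",
    "```python\n",
    "from itertools import islice\n",
    "\n",
    "\n",
    "def sliding_windows(items, n):\n",
    "    \"\"\"Yield every run of n consecutive items as a tuple.\"\"\"\n",
    "    it = iter(items)\n",
    "    win = tuple(islice(it, n))\n",
    "    if len(win) == n:\n",
    "        yield win\n",
    "    for item in it:\n",
    "        win = win[1:] + (item,)\n",
    "        yield win\n",
    "\n",
    "\n",
    "# Toy sentences standing in for a sentence-split story.\n",
    "sentences = [\"S1.\", \"S2.\", \"S3.\", \"S4.\", \"S5.\", \"S6.\"]\n",
    "\n",
    "# First sentence, last sentence, and everything in between.\n",
    "beginning, *middle, end = sentences\n",
    "\n",
    "# Overlapping chunks of the middle; each one would be summarized separately.\n",
    "windows = [\" \".join(win) for win in sliding_windows(middle, 3)]\n",
    "```\n",
    "\n",
    "A middle section shorter than the window size produces no windows at all, so very short stories contribute only beginning and end samples."
   ]
  },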
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Summarization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#!pip install transformers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# @markdown utils\n",
    "from transformers.utils.logging import set_verbosity\n",
    "\n",
    "set_verbosity(40)\n",
    "\n",
    "import warnings\n",
    "\n",
    "# ignore hf pipeline complaints\n",
    "warnings.filterwarnings(\"ignore\", category=UserWarning, module=\"transformers\")\n",
    "warnings.filterwarnings(\"ignore\", category=FutureWarning, module=\"transformers\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "from transformers import pipeline\n",
    "\n",
    "summarizer = pipeline(\n",
    "    \"summarization\",\n",
    "    \"pszemraj/long-t5-tglobal-base-16384-book-summary\",\n",
    "    device=0 if torch.cuda.is_available() else -1,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "params = {\n",
    "    \"max_length\": 1024,\n",
    "    \"min_length\": 8,\n",
    "    \"no_repeat_ngram_size\": 3,\n",
    "    \"early_stopping\": False,\n",
    "    \"repetition_penalty\": 3.5,\n",
    "    \"length_penalty\": 0.3,\n",
    "    \"encoder_no_repeat_ngram_size\": 3,\n",
    "    \"num_beams\": 4,\n",
    "}  # parameters for text generation out of model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Interpolation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import spacy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# helper functions\n",
    "\n",
    "import re\n",
    "\n",
    "\n",
    "def extract_prompt_parts(prompt, pattern):\n",
    "    \"\"\"\n",
    "    takes a prompt and some parts that matches to patern\n",
    "    \"\"\"\n",
    "    pattern = pattern.replace(\"\\\\\\n\", \"\\\\\")\n",
    "    if m := re.search(pattern, prompt, re.IGNORECASE):\n",
    "        if len(m.groups()) > 0:\n",
    "            return m.group(0)\n",
    "    return None\n",
    "\n",
    "\n",
    "from spacy.lang.en import English\n",
    "\n",
    "\n",
    "def get_sentences(_str):\n",
    "    chunks = _str.split(\"\\n\")\n",
    "    sentences = []\n",
    "    nlp = English()\n",
    "    nlp.add_pipe(\"sentencizer\")\n",
    "    for chunk in chunks:\n",
    "        doc = nlp(chunk)\n",
    "        sentences += [sent.text.strip() for sent in doc.sents]\n",
    "    return sentences\n",
    "\n",
    "\n",
    "from itertools import islice\n",
    "\n",
    "\n",
    "def window(seq, n=2):\n",
    "    it = iter(seq)\n",
    "    result = tuple(islice(it, n))\n",
    "    if len(result) == n:\n",
    "        yield \" \".join(result)\n",
    "    for elem in it:\n",
    "        result = result[1:] + (elem,)\n",
    "        yield \" \".join(result)\n",
    "\n",
    "\n",
    "def extract_story_parts(story):\n",
    "    sentences = get_sentences(story)\n",
    "    beginning = sentences.pop(0)\n",
    "    middles = window(sentences, 4)\n",
    "    ending = sentences.pop(-1)\n",
    "    return beginning, middles, ending\n",
    "\n",
    "\n",
    "def clear_prompt(prompt):\n",
    "    return re.sub(r\"^[Ww]rite \", \"\", prompt)\n",
    "\n",
    "\n",
    "def get_sample_dict(split, id, text):\n",
    "    return {\"split\": split, \"splitLineIndex\": id, \"text\": text}\n",
    "\n",
    "\n",
    "def generate_instruction_diologs(df):\n",
    "    dialogs = []\n",
    "    \"\"\"User: What is this story about: {story} -> Rosey: I think it's about: {striped_prompt}\"\"\"\n",
    "    dialogBase = \"\"\"User: write me a story about: {stripped_prompt}\"\"\"\n",
    "    dialog1 = \"\"\" -> Rosey: Sure, here's a story about: {stripped_prompt}:\\n{story}\"\"\"\n",
    "    dialog2 = \"\"\", {stripped_constraint} -> Rosey: Sure, here's a story about: {stripped_prompt}, {stripped_constraint}:\\n{story}\"\"\"\n",
    "    dialog3 = \"\"\", starting with: {beggining} -> Rosey: Sure, here's a story about: {stripped_prompt}, starting with: {beggining}:\\n{story}\"\"\"\n",
    "    dialog4 = \"\"\", ending with: {ending} -> Rosey: Sure, here's a story about {stripped_prompt}: ending with: {ending}\\n{story}\"\"\"\n",
    "    dialog5 = \"\"\", where the middle of the story is about: {middle} -> Rosey: Sure, here's a story about: {stripped_prompt}, where the middle of the story is about: {middle}:\\n{story}\"\"\"\n",
    "\n",
    "    df_rep = df.groupby([\"prompt\"]).size().reset_index().rename(columns={0: \"records\"})\n",
    "    df_rep.sort_values([\"records\"], ascending=False, inplace=True)\n",
    "    pbar = tqdm()\n",
    "    pbar.reset(total=len(df_rep))\n",
    "    for prompt in df_rep.iloc[:, 0]:\n",
    "        strippedPrompt = extract_prompt_parts(prompt, PROMPT_PATTERNS)\n",
    "        if strippedPrompt is None:\n",
    "            continue\n",
    "        strippedPrompt = clear_prompt(strippedPrompt)\n",
    "        strippedConstraint = extract_prompt_parts(prompt, CONST_PATTERNS)\n",
    "\n",
    "        for row in df[df[\"prompt\"] == prompt].itertuples():\n",
    "            try:\n",
    "                story = (\n",
    "                    row.story.replace(\"<newline>\", \"\\n\")\n",
    "                    .replace(\"< newline >\", \"\\n\")\n",
    "                    .replace(\"<new line>\", \"\\n\")\n",
    "                    .strip()\n",
    "                )\n",
    "                beginning, middles, ending = extract_story_parts(story)\n",
    "                dialogBeg = dialogBase.format(stripped_prompt=strippedPrompt)\n",
    "                dialog = dialogBeg + dialog1.format(story=story, stripped_prompt=strippedPrompt)\n",
    "                dialogs.append(get_sample_dict(row.split, row.splitIndex, dialog))\n",
    "                if strippedConstraint is not None:\n",
    "                    dialog = dialogBeg + dialog2.format(\n",
    "                        stripped_prompt=strippedPrompt, stripped_constraint=strippedConstraint, story=story\n",
    "                    )\n",
    "                    dialogs.append(get_sample_dict(row.split, row.splitIndex, dialog))\n",
    "                dialog = dialogBeg + dialog3.format(stripped_prompt=strippedPrompt, story=story, beggining=beginning)\n",
    "                dialogs.append(get_sample_dict(row.split, row.splitIndex, dialog))\n",
    "                dialog = dialogBeg + dialog4.format(stripped_prompt=strippedPrompt, story=story, ending=ending)\n",
    "                dialogs.append(get_sample_dict(row.split, row.splitIndex, dialog))\n",
    "                middlesSumarizered = summarizer(middles, **params)\n",
    "                for middle, sumarizedMiddle in zip(middles, middlesSumarizered):\n",
    "                    # dialogs.append(dialogBeg + dialog5.format(stripped_prompt=strippedPrompt, story=story, middle=middle))\n",
    "                    dialog = dialogBeg + dialog5.format(\n",
    "                        stripped_prompt=strippedPrompt, story=story, middle=sumarizedMiddle[0][\"summary_text\"]\n",
    "                    )\n",
    "                    dialogs.append(get_sample_dict(row.split, row.splitIndex, dialog))\n",
    "                pbar.update()\n",
    "            except Exception as e:\n",
    "                print(f\"{row.split}/{row.splitIndex}\")\n",
    "                raise e\n",
    "        pbar.refresh()\n",
    "    return dialogs\n",
    "\n",
    "\n",
    "def filter_data(\n",
    "    dataset,\n",
    "    negativeTagFilter=None,\n",
    "    positiveTagFilter=None,\n",
    "    patternFilter=None,\n",
    "):\n",
    "    \"\"\"\n",
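 filter_data(dataset">
    "    Filter the prompts by tag columns and/or a regex pattern and return the matching stories.\n",
    "\n",
    "    Example:\n",
    "    > filter_data(dataset['train'], negativeTagFilter=['ip'], positiveTagFilter=['pm'])\n",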
    "    \"\"\"\n",
    "    prompt = dataset[\"prompt\"]\n",
    "    if negativeTagFilter is not None:\n",
    "        prompt = prompt[(prompt[negativeTagFilter] < 1).any(axis=1)]\n",
    "    if positiveTagFilter is not None:\n",
    "        prompt = prompt[prompt[positiveTagFilter].gt(0).all(axis=1)]\n",
    "    if patternFilter is not None:\n",
    "        prompt = prompt[prompt[\"prompt\"].str.contains(patternFilter)]\n",
    "    story = dataset[\"story\"]\n",
    "    story = story.iloc[prompt.index]\n",
    "    return {\"prompt\": prompt, \"story\": story}\n",
    "\n",
    "\n",
    "def generate_instruction_diologs(prompt, df):\n",
    "    dialogs = []\n",
    "    \"\"\"User: What is this story about: {story} -> Rosey: I think it's about: {striped_prompt}\"\"\"\n",
    "    dialogBase = \"\"\"User: write me a story about: {stripped_prompt}\"\"\"\n",
    "    dialog1 = \"\"\" -> Rosey: Sure, here's a story about: {stripped_prompt}:\\n{story}\"\"\"\n",
    "    dialog2 = \"\"\", {stripped_constraint} -> Rosey: Sure, here's a story about: {stripped_prompt}, {stripped_constraint}:\\n{story}\"\"\"\n",
    "    dialog3 = \"\"\", starting with: {beggining} -> Rosey: Sure, here's a story about: {stripped_prompt}, starting with: {beggining}:\\n{story}\"\"\"\n",
    "    dialog4 = \"\"\", ending with: {ending} -> Rosey: Sure, here's a story about {stripped_prompt}: ending with: {ending}\\n{story}\"\"\"\n",
    "    dialog5 = \"\"\", where the middle of the story is about: {middle} -> Rosey: Sure, here's a story about: {stripped_prompt}, where the middle of the story is about: {middle}:\\n{story}\"\"\"\n",
    "\n",
    "    strippedPrompt = extract_prompt_parts(prompt, PROMPT_PATTERNS)\n",
    "    if strippedPrompt is not None:\n",
    "        strippedPrompt = clear_prompt(strippedPrompt)\n",
    "        strippedConstraint = extract_prompt_parts(prompt, CONST_PATTERNS)\n",
    "        pbar = tqdm(ascii=True, desc=\"stories\")\n",
    "        pbar.reset(total=len(df[df[\"prompt\"] == prompt]))\n",
    "        for row in df[df[\"prompt\"] == prompt].itertuples():\n",
    "            try:\n",
    "                story = (\n",
    "                    row.story.replace(\"<newline>\", \"\\n\")\n",
    "                    .replace(\"< newline >\", \"\\n\")\n",
    "                    .replace(\"<new line>\", \"\\n\")\n",
    "                    .strip()\n",
    "                )\n",
    "                dialogBeg = dialogBase.format(stripped_prompt=strippedPrompt)\n",
    "                dialog = dialogBeg + dialog1.format(story=story, stripped_prompt=strippedPrompt)\n",
    "                dialogs.append(get_sample_dict(row.split, row.splitLineIndex, dialog))\n",
    "                if strippedConstraint is not None:\n",
    "                    dialog = dialogBeg + dialog2.format(\n",
    "                        stripped_prompt=strippedPrompt, stripped_constraint=strippedConstraint, story=story\n",
    "                    )\n",
    "                    dialogs.append(get_sample_dict(row.split, row.splitLineIndex, dialog))\n",
    "                beginning, middles, ending = extract_story_parts(story)\n",
    "                if beginning is not None:\n",
    "                    beginning, middles, ending = extract_story_parts(story)\n",
    "                    dialog = dialogBeg + dialog3.format(\n",
    "                        stripped_prompt=strippedPrompt, story=story, beggining=beginning\n",
    "                    )\n",
    "                    dialogs.append(get_sample_dict(row.split, row.splitLineIndex, dialog))\n",
    "                    dialog = dialogBeg + dialog4.format(stripped_prompt=strippedPrompt, story=story, ending=ending)\n",
    "                    dialogs.append(get_sample_dict(row.split, row.splitLineIndex, dialog))\n",
    "                    middlesSumarizered = summarizer(middles, **params)\n",
    "                    for middle, sumarizedMiddle in zip(middles, middlesSumarizered):\n",
    "                        # dialogs.append(dialogBeg + dialog5.format(stripped_prompt=strippedPrompt, story=story, middle=middle))\n",
    "                        dialog = dialogBeg + dialog5.format(\n",
    "                            stripped_prompt=strippedPrompt, story=story, middle=sumarizedMiddle[0][\"summary_text\"]\n",
    "                        )\n",
    "                        dialogs.append(get_sample_dict(row.split, row.splitLineIndex, dialog))\n",
    "                pbar.update()\n",
    "            except Exception as e:\n",
    "                print(f\"{row.split}/{row.splitLineIndex}\")\n",
    "                raise e\n",
    "            pbar.refresh()\n",
    "    return dialogs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Generate "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It saves parquet every `step` samples to avoid losing work. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "## filter dataset to take only prompts with frequency greater than 20 stories.\n",
    "dialogs = []\n",
    "i = 0\n",
    "start = 0\n",
    "step = 10\n",
    "for index in range(start, len(topPrompts20Reps), step):\n",
    "    pbar = tqdm(ascii=True, desc=\"prompt\")\n",
    "    pbar.reset(total=len(topPrompts20Reps[index : index + step]))\n",
    "    for prompt in topPrompts20Reps[index : index + step]:\n",
    "        tmpDialogs = generate_instruction_diologs(prompt, ds)\n",
    "        if tmpDialogs is not None:\n",
    "            dialogs += tmpDialogs\n",
    "        pbar.update()\n",
    "    if len(dialogs) > 0:\n",
    "        pd.DataFrame(dialogs).to_parquet(\"writing-prompts-aug.parquet\")\n",
    "    pbar.refresh()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_parquet(\"writing-prompts-aug.parquet\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for split in list(set(df.split)):\n",
    "    df_aux = df[df[\"split\"] == split].iloc[:, 1:]\n",
    "    df_aux.reset_index(inplace=True)\n",
    "    df_aux.iloc[:, 1:].to_parquet(f\"{split}.parquet\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "01073391c27d455898ddec5e5b613840": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "02aff4fac4454967b80469f0774e1a6c": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "03209aedabd94b9f97c7ff186d61a1b5": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "03c75c2c3a674154aa1370081c8d2d0c": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "04eee7ef7947484c9a2fb9bb6ff14eec": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "04f0d4dafcee402780ab34cfba03179e": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_7390362a9704413984a47a1d5b262276",
       "IPY_MODEL_b263a25a96f547218983b9e62f2b841c",
       "IPY_MODEL_f8ac6fc3cf284b50bb54c6ade26db5a1"
      ],
      "layout": "IPY_MODEL_aca1b6be80124fd0999892577aee9f1e"
     }
    },
    "05cf82d369674d848d9d2dd50be546ad": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_949e1ca0688f4df39c6f0aee139a8a4b",
      "placeholder": "​",
      "style": "IPY_MODEL_666605f8ef614cc5806b7e2076095746",
      "value": " 27%"
     }
    },
    "09ad8cfb26814f979a82ac73f073d5c2": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_903c2a0ea90043d5ab9c6812ee118c1a",
       "IPY_MODEL_664e94791b1946e1a78bfa93e9ce0b6f",
       "IPY_MODEL_322330e98fc745df9b55a959392c015c"
      ],
      "layout": "IPY_MODEL_02aff4fac4454967b80469f0774e1a6c"
     }
    },
    "0ae446f572cd4bc5b6ac64e5f1aff216": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "0bba8f8e7f754d1eb204db2ceab4aaab": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "0d209a94698d43748bccb06629b1c97a": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "14f3ee8a6fa943178e949c45baa7683f": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "161a4ed9fcd04fee984704a6666f5399": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "1924812f3b644648ae3671cb1f8f659f": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "1a774659596145c48dfd1703664ffbaa": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "1c9c8492343e4a86b3977b41abf2c91c": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "20beb9b7ad504afba558ed28b6fb242b": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_9a369f1da9f94552960bb42bc895fb4a",
       "IPY_MODEL_4f372c13f77245c49925981c33d1d611",
       "IPY_MODEL_d0381de0ca3a4359a0d2c393e9f64f69"
      ],
      "layout": "IPY_MODEL_c50b53014ab44ef4b196b1a79c1ad61c"
     }
    },
    "2102cf1d8f6b4192b6d45dfbe4e5044d": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "227455685ef746a4845020529c86aca2": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "28085d8a3b4341e5bac2ce7efd9d89d5": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_92fb3795816548ffb336749cf590d335",
      "max": 2422362,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_3402b3a652254e90b3d6ef17dccfe90a",
      "value": 2422362
     }
    },
    "2e3b3d799b5b461d91fb4b2fa64ea7be": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "322330e98fc745df9b55a959392c015c": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_7c9fbbe9addd4d4a82a0e7f2a9410af2",
      "placeholder": "​",
      "style": "IPY_MODEL_c4b3a987b2eb4d81a209fe62f1f00459",
      "value": " 792k/792k [00:00&lt;00:00, 10.6MB/s]"
     }
    },
    "33fc1d4498574c1e86e7c336ab3c4a9d": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "3402b3a652254e90b3d6ef17dccfe90a": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "3612d9a6e93348d6b7b98ca7d611eec4": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_1924812f3b644648ae3671cb1f8f659f",
      "placeholder": "​",
      "style": "IPY_MODEL_0bba8f8e7f754d1eb204db2ceab4aaab",
      "value": "Downloading (…)lve/main/config.json: 100%"
     }
    },
    "3dce9197ad544ff2be48248640298d38": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "43b109811c7d42089713ad5c327afc9d": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "43f709c83c424926b92e36acc3c95e1a": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "473e0749bada493b90253b7c0a816e59": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "4b9d6ee49ebd4c018d01f8a64fb112e1": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_05cf82d369674d848d9d2dd50be546ad",
       "IPY_MODEL_ce7a65dfe8a04e29b8512044fe994b87",
       "IPY_MODEL_e639d6f2dafd4897a9a5df658cdf68b0"
      ],
      "layout": "IPY_MODEL_0d209a94698d43748bccb06629b1c97a"
     }
    },
    "4cfd63abaee74a1babed15ecc1ee834a": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "4f372c13f77245c49925981c33d1d611": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_8b28758bfe16428ca933c9100b7a8b29",
      "max": 2361,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_983d2b1c0515441db135aae6dd217c41",
      "value": 2361
     }
    },
    "5d2a8c459bfc4e24be21ceef1ec86ae0": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "62113a2cac0d499b9acf2a89f1993f9a": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "664e94791b1946e1a78bfa93e9ce0b6f": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_2102cf1d8f6b4192b6d45dfbe4e5044d",
      "max": 791656,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_d07b4780b79340c8950e3f12c4d70820",
      "value": 791656
     }
    },
    "666605f8ef614cc5806b7e2076095746": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "7390362a9704413984a47a1d5b262276": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_c3f48d9f38b8419aae37d33b4968c2f5",
      "placeholder": "​",
      "style": "IPY_MODEL_814a8be16bbd4c499b23e931155c6169",
      "value": "Downloading (…)cial_tokens_map.json: 100%"
     }
    },
    "78188eb50348434e92dc947f6baae899": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_e33fd695d0af44dcb068cd168190ec03",
      "max": 1125,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_cf90c2cb43ae481baa3ef13417b1fc4b",
      "value": 1125
     }
    },
    "7c9fbbe9addd4d4a82a0e7f2a9410af2": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "7d29075f6e25436cb7fa531b4f1b92f0": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_03209aedabd94b9f97c7ff186d61a1b5",
      "placeholder": "​",
      "style": "IPY_MODEL_43b109811c7d42089713ad5c327afc9d",
      "value": " 1.12k/1.12k [00:00&lt;00:00, 39.0kB/s]"
     }
    },
    "7fa486f7cf6e41668382b57979928ecd": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "814a8be16bbd4c499b23e931155c6169": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "818222eaa6d64018b9058bcf6531b658": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_2e3b3d799b5b461d91fb4b2fa64ea7be",
      "placeholder": "​",
      "style": "IPY_MODEL_c41b65885a7b46d8b205b7db8e123cf4",
      "value": "Downloading (…)/main/tokenizer.json: 100%"
     }
    },
    "83ad5f094e684a33b03a28fb7b54f1cc": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_818222eaa6d64018b9058bcf6531b658",
       "IPY_MODEL_28085d8a3b4341e5bac2ce7efd9d89d5",
       "IPY_MODEL_ca32d31fb99e4b5990ba6fd33d3e1915"
      ],
      "layout": "IPY_MODEL_62113a2cac0d499b9acf2a89f1993f9a"
     }
    },
    "865eeaa12f9d4ecbb5e38b2b3baaa4cd": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_feefc865187648db9956cafc5914b123",
       "IPY_MODEL_eb27214d49314527aa99ab65e62ac529",
       "IPY_MODEL_a87f3e961e0d486d81bebec195b396a5"
      ],
      "layout": "IPY_MODEL_01073391c27d455898ddec5e5b613840"
     }
    },
    "876761d92c4a47558604f24826fbf276": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "88b58ed1580c4cf195963010c20d5454": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "88c37802c3914ae6ab3e2cff32cfbe87": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HBoxModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HBoxModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HBoxView",
      "box_style": "",
      "children": [
       "IPY_MODEL_3612d9a6e93348d6b7b98ca7d611eec4",
       "IPY_MODEL_78188eb50348434e92dc947f6baae899",
       "IPY_MODEL_7d29075f6e25436cb7fa531b4f1b92f0"
      ],
      "layout": "IPY_MODEL_88b58ed1580c4cf195963010c20d5454"
     }
    },
    "8b28758bfe16428ca933c9100b7a8b29": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "8bc9ac5c49a445e5b341513efaf58a83": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "903c2a0ea90043d5ab9c6812ee118c1a": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_1a774659596145c48dfd1703664ffbaa",
      "placeholder": "​",
      "style": "IPY_MODEL_1c9c8492343e4a86b3977b41abf2c91c",
      "value": "Downloading (…)&quot;spiece.model&quot;;: 100%"
     }
    },
    "92fb3795816548ffb336749cf590d335": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "949e1ca0688f4df39c6f0aee139a8a4b": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "983d2b1c0515441db135aae6dd217c41": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "9a369f1da9f94552960bb42bc895fb4a": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_227455685ef746a4845020529c86aca2",
      "placeholder": "​",
      "style": "IPY_MODEL_d65137c7ad444b38a2b8fcd1d36c1528",
      "value": "Downloading (…)okenizer_config.json: 100%"
     }
    },
    "a1e32f35ab1c4014aa6903ef043b469c": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "a87f3e961e0d486d81bebec195b396a5": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_876761d92c4a47558604f24826fbf276",
      "placeholder": "​",
      "style": "IPY_MODEL_33fc1d4498574c1e86e7c336ab3c4a9d",
      "value": " 990M/990M [00:28&lt;00:00, 32.0MB/s]"
     }
    },
    "aca1b6be80124fd0999892577aee9f1e": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "b263a25a96f547218983b9e62f2b841c": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_43f709c83c424926b92e36acc3c95e1a",
      "max": 2201,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_3dce9197ad544ff2be48248640298d38",
      "value": 2201
     }
    },
    "c3f48d9f38b8419aae37d33b4968c2f5": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "c41b65885a7b46d8b205b7db8e123cf4": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "c4b3a987b2eb4d81a209fe62f1f00459": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "c50b53014ab44ef4b196b1a79c1ad61c": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "c6249ce38c8f437f9234faa7081743d4": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "c6da8ecbbf374f0d84e8704546a30c27": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "ca32d31fb99e4b5990ba6fd33d3e1915": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_4cfd63abaee74a1babed15ecc1ee834a",
      "placeholder": "​",
      "style": "IPY_MODEL_14f3ee8a6fa943178e949c45baa7683f",
      "value": " 2.42M/2.42M [00:01&lt;00:00, 1.53MB/s]"
     }
    },
    "ce7a65dfe8a04e29b8512044fe994b87": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_03c75c2c3a674154aa1370081c8d2d0c",
      "max": 1016,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_473e0749bada493b90253b7c0a816e59",
      "value": 274
     }
    },
    "cf90c2cb43ae481baa3ef13417b1fc4b": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "d0381de0ca3a4359a0d2c393e9f64f69": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_ddf56a6653304256bb61c8b69710fbec",
      "placeholder": "​",
      "style": "IPY_MODEL_0ae446f572cd4bc5b6ac64e5f1aff216",
      "value": " 2.36k/2.36k [00:00&lt;00:00, 111kB/s]"
     }
    },
    "d07b4780b79340c8950e3f12c4d70820": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "ProgressStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "ProgressStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "bar_color": null,
      "description_width": ""
     }
    },
    "d65137c7ad444b38a2b8fcd1d36c1528": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "ddf56a6653304256bb61c8b69710fbec": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "e33fd695d0af44dcb068cd168190ec03": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "e639d6f2dafd4897a9a5df658cdf68b0": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_161a4ed9fcd04fee984704a6666f5399",
      "placeholder": "​",
      "style": "IPY_MODEL_8bc9ac5c49a445e5b341513efaf58a83",
      "value": " 273/1016 [2:01:10&lt;8:24:02, 40.70s/it]"
     }
    },
    "eb27214d49314527aa99ab65e62ac529": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "FloatProgressModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "FloatProgressModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "ProgressView",
      "bar_style": "success",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_04eee7ef7947484c9a2fb9bb6ff14eec",
      "max": 990446387,
      "min": 0,
      "orientation": "horizontal",
      "style": "IPY_MODEL_a1e32f35ab1c4014aa6903ef043b469c",
      "value": 990446387
     }
    },
    "f8ac6fc3cf284b50bb54c6ade26db5a1": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_c6249ce38c8f437f9234faa7081743d4",
      "placeholder": "​",
      "style": "IPY_MODEL_5d2a8c459bfc4e24be21ceef1ec86ae0",
      "value": " 2.20k/2.20k [00:00&lt;00:00, 119kB/s]"
     }
    },
    "feefc865187648db9956cafc5914b123": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "HTMLModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "HTMLModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "HTMLView",
      "description": "",
      "description_tooltip": null,
      "layout": "IPY_MODEL_c6da8ecbbf374f0d84e8704546a30c27",
      "placeholder": "​",
      "style": "IPY_MODEL_7fa486f7cf6e41668382b57979928ecd",
      "value": "Downloading (…)&quot;pytorch_model.bin&quot;;: 100%"
     }
    }
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
